Median-LVQ for Classification of Dissimilarity Data based on ROC-Optimization
Abstract
In this article we consider a median variant of the learning vector quantization (LVQ) classifier for the classification of dissimilarity data. Beyond the median aspect, we propose to optimize the receiver operating characteristic (ROC) instead of the classification accuracy. In particular, we present a probabilistic LVQ model with an adaptation scheme based on a generalized Expectation-Maximization procedure, which allows maximization of the area under the ROC curve for such dissimilarity data. The basic idea is to use ordered pairs as structured input for learning. The new scheme can be seen as a supplement to the recently introduced LVQ scheme for ROC optimization of vector data.
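The link between ordered pairs and the area under the ROC curve can be illustrated with the standard Wilcoxon-Mann-Whitney identity: the AUC equals the fraction of (positive, negative) sample pairs that the classifier ranks in the correct order. The sketch below is not the probabilistic model or the Expectation-Maximization scheme of the article. It only shows this identity on dissimilarity data, under the assumption that each class is represented by a single median prototype (a data sample, given by its index) and that a sample is scored by the difference of its dissimilarities to the two prototypes. The names `prototype_scores` and `auc_from_ordered_pairs` are hypothetical.

```python
from itertools import product


def prototype_scores(D, indices, pos_proto, neg_proto):
    """Score samples from a dissimilarity matrix D (list of lists).

    Assumption for illustration: one median prototype per class, each
    prototype being the index of a data sample. A larger score means the
    sample is closer to the positive prototype than to the negative one.
    """
    return [D[i][neg_proto] - D[i][pos_proto] for i in indices]


def auc_from_ordered_pairs(pos_scores, neg_scores):
    """AUC as the fraction of correctly ordered (positive, negative) pairs.

    Ties are counted as half a correct pair (Wilcoxon-Mann-Whitney statistic).
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one score per class")
    total = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            total += 1.0
        elif p == n:
            total += 0.5
    return total / (len(pos_scores) * len(neg_scores))


# Toy dissimilarity matrix for four samples: 0 and 1 positive, 2 and 3 negative.
D = [
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 4.0],
    [4.0, 3.0, 0.0, 1.0],
    [5.0, 4.0, 1.0, 0.0],
]
pos = prototype_scores(D, [0, 1], pos_proto=0, neg_proto=3)
neg = prototype_scores(D, [2, 3], pos_proto=0, neg_proto=3)
auc = auc_from_ordered_pairs(pos, neg)
```

Because the pairwise count is a sum over ordered pairs, a learning scheme that takes such pairs as input can target the AUC directly rather than the accuracy. The quadratic loop is chosen for readability, not efficiency.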
Similar resources
Median Variants of LVQ for Optimization of Statistical Quality Measures for Classification of Dissimilarity Data
In this article we consider median variants of the learning vector quantization classifier for the classification of dissimilarity data. In particular, we are interested in the optimization of advanced classification quality measures such as sensitivity, specificity, or the Fβ-measure. These measures are frequently more appropriate than simple accuracy, in particular if the training data are imbalanced for t...
Patch Processing for Relational Learning Vector Quantization
Recently, an extension of the popular learning vector quantization (LVQ) to general dissimilarity data has been proposed: relational generalized LVQ (RGLVQ) [10, 9]. The result is an intuitive prototype-based classification scheme that can divide data characterized by pairwise dissimilarities into predefined categories. However, the technique relies on the full dissimilarity matrix and, thus, has squ...
Relational Extensions of Learning Vector Quantization
Prototype-based models offer an intuitive interface to given data sets by means of an inspection of the model prototypes. Supervised classification can be achieved by popular techniques such as learning vector quantization (LVQ) and extensions derived from cost functions such as generalized LVQ (GLVQ) and robust soft LVQ (RSLVQ). These methods, however, are restricted to Euclidean vectors and t...
Composite Kernel Optimization in Semi-Supervised Metric
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance compared with traditional metrics. This has recently stimulated considerable interest in the to...
Clustering of Gene Expression Data Using Random Forest Dissimilarity
Background: The clustering of gene expression data plays an important role in the diagnosis and treatment of cancer. These kinds of data typically involve a large number of variables (genes) in comparison with the number of samples (patients). Many clustering methods have been built on the dissimilarity among observations, which is calculated by a distance function. As increa...